Abstract



Kleene algebra with tests is an extension of Kleene algebra, the algebra of regular expressions,
which can be used to reason about programs. We develop a coalgebraic theory of Kleene algebra
with tests, along the lines of the coalgebraic theory of regular expressions based on
deterministic automata. Since the known automata-theoretic presentation of Kleene algebra with
tests does not lend itself to a coalgebraic theory, we define a new interpretation of Kleene
algebra with tests expressions and a corresponding automata-theoretic presentation. One outcome
of the theory is a coinductive proof principle that can be used to establish equivalence of our
Kleene algebra with tests expressions.

1 Introduction 

Kleene algebra (KA) is the algebra of regular expressions [Conway 1971; Kleene 1956]. As is well 
known, the theory of regular expressions enjoys a strong connection with the theory of finite-state 
automata. This connection was used by Rutten [1998] to give a coalgebraic treatment of regular 
expressions. One of the fruits of this coalgebraic treatment is coinduction, a proof technique for 
demonstrating the equivalence of regular expressions [Rutten 2000]. Other methods for proving 
the equality of regular expressions have previously been established — for instance, reasoning by 
using a sound and complete axiomatization [Kozen 1994; Salomaa 1966], or by minimization of 
automata representing the expressions [Hopcroft and Ullman 1979]. However, the coinduction
proof technique can give relatively short proofs, and is fairly simple to apply. 

Recently, Kozen [1997] introduced Kleene algebra with tests (KAT), an extension of KA de- 
signed for the particular purpose of reasoning about programs and their properties. The regular 
expressions of KAT allow one to intersperse boolean tests along with program actions, permitting 
the convenient modelling of programming constructs such as conditionals and while loops. The util- 
ity of KAT is evidenced by the fact that it subsumes propositional Hoare logic, providing a complete 
deductive system for Hoare-style inference rules for partial correctness assertions [Kozen 1999]. 

The goal of this paper is to develop a coalgebraic theory of KAT, paralleling the coalgebraic 
treatment of KA. Our coalgebraic theory yields a coinductive proof principle for demonstrating the 
equality of KAT expressions, in analogy to the coinductive proof principle for regular expressions. 

*This paper is essentially the same as one that will appear in Theoretical Computer Science. A preliminary version ap- 
peared in the Proceedings of the Sixth International Workshop on Coalgebraic Methods in Computer Science, Electronic 
Notes in Theoretical Computer Science, Volume 82.1, 2003. 
The development of our coalgebraic theory proceeds as follows. We first introduce a form of
deterministic automaton and define the language accepted by such an automaton. Next, we develop
the theory of such automata, showing that coinduction can be applied to the class of languages rep- 
resentable by our automata. We then give a class of expressions, which play the same role as the 
regular expressions in classical automata theory, and fairly simple rules for computing derivatives 
of these expressions. 

The difficulty of our endeavor is that the known automata-theoretic presentation of KAT [Kozen 
2003] does not lend itself to a coalgebraic theory. Moreover, the notion of derivative, essential to 
the coinduction proof principle in this context, is not readily definable for KAT expressions as they 
are defined by Kozen [1997]. Roughly, these difficulties arise from tests being commutative and 
idempotent, and suggest that tests need to be handled in a special way. In order for the coalgebraic 
theory to interact smoothly with tests, we introduce a type system along with new notions of strings, 
languages, automata, and expressions, which we call mixed strings, mixed languages, mixed au- 
tomata, and mixed expressions, respectively. (We note that none of these new notions coincide with 
those already developed in the theory of KAT.) All well-formed instances of these notions can be 
assigned types by our type system. Our type system is inspired by the type system devised by Kozen 
[1998, 2002] for KA and KAT, but is designed to address different issues. 

This paper is structured as follows. In the next section, we introduce mixed strings and mixed 
languages, which will be used to interpret our mixed expressions. In Section 3, we define a notion of 
mixed automaton that is used to accept mixed languages. We then impose a coalgebraic structure on 
such automata. In Section 4, we introduce a sufficient condition for proving equivalence that is more 
convenient than the condition that we derive in Section 3. In Section 5, we introduce our type system 
for KAT, and connect typed KAT expressions with the mixed language they accept. In Section 6, we 
give an example of how to use the coalgebraic theory, via the coinductive proof principle, to establish 
equivalence of typed KAT expressions. In Section 7, we show that our technique is complete, that 
is, it can establish the equivalence of any two typed KAT expressions that are in fact equivalent. We 
conclude in Section 8 with considerations of future work. 

2 Mixed Languages 

In this section, we define the notions of mixed strings and mixed languages that we will use through- 
out the paper. Mixed strings are a variant of the guarded strings introduced by Kaplan [1969] as an 
abstract interpretation for program schemes; sets of guarded strings were used by Kozen [2003] as 
canonical models for Kleene algebra with tests. Roughly speaking, a guarded string can be under- 
stood as a computation where atomic actions are executed amidst the checking of conditions, in the 
form of boolean tests. Mixed strings will be used as an interpretation for the mixed expressions we 
introduce in Section 5. 

Mixed strings are defined over two alphabets: a set of primitive programs (denoted $\mathcal{P}$) and a
set of primitive tests (denoted $\mathcal{B}$). We allow $\mathcal{P}$ to be infinite, but require that $\mathcal{B}$ be finite.
(We will see in Section 3 where this finiteness assumption comes in. Intuitively, this is because our
automata will process each primitive test individually.) Primitive tests can be put together to form
more complicated tests. A literal $l$ is a primitive test $b \in \mathcal{B}$ or its negation $\bar{b}$; the underlying
primitive test $b$ is said to be the base of the literal, and is denoted by $base(l)$. When $A$ is a subset
of $\mathcal{B}$, $lit(A)$ denotes the set of all literals over $A$. A test is a nonempty set of literals with distinct
bases. Intuitively, a test can be understood as the conjunction of the literals it comprises. The base of
a test $t$, denoted by $base(t)$, is defined to be the set $\{base(l) : l \in t\}$, in other words, the primitive
tests the test $t$ is made up from. We extend the notion of base to primitive programs, by defining the
base of a primitive program $p \in \mathcal{P}$ as $\emptyset$.

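Example 2.1: Let $\mathcal{P} = \{p, q\}$, and $\mathcal{B} = \{b, c, d\}$. The literals $lit(\mathcal{B})$ of $\mathcal{B}$ are
$\{b, \bar{b}, c, \bar{c}, d, \bar{d}\}$. Tests include $\{b, c, d\}$ and $\{b, d\}$, but $\{b, \bar{b}, c\}$ is not a test, as $b$ and
$\bar{b}$ have the same base $b$. The base of $\{b, c, d\}$ is $\{b, c, d\}$. ∎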

Primitive programs and tests are used to create mixed strings. A mixed string is either the empty
string, denoted by $\epsilon$, or a sequence $\sigma = a_1 \ldots a_n$ (where $n \geq 1$) with the following properties:

(1) each $a_i$ is either a test or primitive program,

(2) for $i = 1, \ldots, n-1$, if $a_i$ is a test, then $a_{i+1}$ is a primitive program,

(3) for $i = 1, \ldots, n-1$, if $a_i$ is a primitive program, then $a_{i+1}$ is a test, and

(4) for $i = 2, \ldots, n-1$, if $a_i$ is a test, then $base(a_i) = \mathcal{B}$.

Hence, a mixed string is an alternating sequence of primitive programs and tests, where each test
in the sequence is a "complete" test, except possibly if it occurs as the first or the last element of
the sequence. This allows us to manipulate mixed strings on a finer level of granularity; we can
remove literals from the beginning of a mixed string and still obtain a mixed string. The length of
the empty mixed string $\epsilon$ is 0, while the length of a mixed string $a_1 \ldots a_n$ is $n$.

Example 2.2: Let $\mathcal{P} = \{p, q\}$, and $\mathcal{B} = \{b, c, d\}$. Mixed strings include $\epsilon$ (of length 0), $\{b\}$ and
$p$ (both of length 1), and $\{b\}p\{b, c, d\}q\{d\}$ (of length 5). The sequence $\{b\}p\{b, d\}q\{d\}$ is not a
mixed string, since $base(\{b, d\}) \neq \mathcal{B}$. ∎
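The four conditions above can be checked mechanically. The following is a minimal sketch, not part of the original development, under an encoding we assume purely for illustration: a primitive program is a Python string, a test is a `frozenset` of literal names, and a name prefixed with `~` stands for a negated primitive test. The function names are hypothetical.

```python
def base(a):
    """Base of an element: empty for a program, the set of primitive tests for a test."""
    if isinstance(a, str):
        return frozenset()
    return frozenset(x.lstrip("~") for x in a)

def is_test(a):
    """A test is a nonempty set of literals with pairwise distinct bases."""
    return isinstance(a, frozenset) and len(a) > 0 and len(base(a)) == len(a)

def is_mixed_string(s, P, B):
    """Check conditions (1)-(4) for the tuple s over programs P and primitive tests B."""
    n = len(s)
    for i, a in enumerate(s):
        # condition (1): every element is a primitive program or a test over B
        if isinstance(a, str):
            if a not in P:
                return False
        elif not (is_test(a) and base(a) <= B):
            return False
        # conditions (2) and (3): programs and tests strictly alternate
        if i + 1 < n and isinstance(a, str) == isinstance(s[i + 1], str):
            return False
        # condition (4): a test that is neither first nor last must be complete
        if 0 < i < n - 1 and not isinstance(a, str) and base(a) != B:
            return False
    return True
```

With this encoding, the empty tuple plays the role of $\epsilon$, and the two sequences of Example 2.2 are respectively accepted and rejected by `is_mixed_string`.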

We define the concatenation of two mixed strings $\sigma$ and $\sigma'$, denoted by $\sigma \cdot \sigma'$, as follows. If one
of $\sigma, \sigma'$ is the empty string, then their concatenation is the other string. If both $\sigma = a_1 \ldots a_n$ and
$\sigma' = b_1 \ldots b_m$ have non-zero length, their concatenation is defined as:

(1) $\tau = a_1 \ldots a_n b_1 \ldots b_m$ if exactly one of $a_n, b_1$ is a primitive program and $\tau$ is a mixed string;

(2) $\tau = a_1 \ldots a_{n-1} (a_n \cup b_1) b_2 \ldots b_m$ if $a_n$ and $b_1$ are tests such that $base(a_n) \cap base(b_1) = \emptyset$
and $\tau$ is a mixed string; and is

(3) undefined otherwise.

Intuitively, concatenation of the two strings is obtained by concatenating the sequence of string
elements, possibly by combining the last test of the first string with the first test of the second string,
provided that the result is a valid mixed string. We note that concatenation of strings is an associative
operation.

Example 2.3: Let $\mathcal{P} = \{p, q\}$, and $\mathcal{B} = \{b, c, d\}$. The concatenation of the mixed strings $p$ and
$\{b, c, d\}q$ is $p\{b, c, d\}q$. Similarly, the concatenation of the mixed strings $\{b\}p\{b, c\}$ and $\{d\}q\{d\}$
is the mixed string $\{b\}p\{b, c, d\}q\{d\}$. However, the concatenation of $\{b\}p\{b, c\}$ and $\{b, d\}q$ is not
defined, as $\{b, c\} \cap \{b, d\} \neq \emptyset$. The concatenation of $\{b\}p\{b, c\}$ and $q$ is also not defined, as
$base(\{b, c\}) \neq \mathcal{B}$, and thus $\{b\}p\{b, c\}q$ is not a mixed string. ∎
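The partial concatenation operation can be sketched as follows. This is an illustrative sketch only; it assumes a hypothetical encoding in which programs are strings, tests are `frozenset`s of literal names (a `~` prefix marking negation), mixed strings are tuples, and `None` stands for "undefined". For brevity the validity check only covers alternation and completeness of interior tests.

```python
def base(a):
    """Base of an element: empty for a program, the primitive tests of a test otherwise."""
    return frozenset() if isinstance(a, str) else frozenset(x.lstrip("~") for x in a)

def is_mixed_string(s, B):
    """Alternation and interior-completeness checks for a tuple of elements."""
    for i, a in enumerate(s):
        if i + 1 < len(s) and isinstance(a, str) == isinstance(s[i + 1], str):
            return False                      # programs and tests must alternate
        if 0 < i < len(s) - 1 and not isinstance(a, str) and base(a) != B:
            return False                      # interior tests must be complete
    return True

def concat(s, t, B):
    """Concatenation of mixed strings s and t; returns None when it is undefined."""
    if not s:
        return t
    if not t:
        return s
    a, b = s[-1], t[0]
    if isinstance(a, str) != isinstance(b, str):
        r = s + t                             # case (1): exactly one of a, b is a program
    elif not isinstance(a, str) and not (base(a) & base(b)):
        r = s[:-1] + (a | b,) + t[1:]         # case (2): merge two tests with disjoint bases
    else:
        return None                           # case (3)
    return r if is_mixed_string(r, B) else None
```

Run on the four pairs of Example 2.3, this sketch returns the two merged strings for the first two pairs and `None` for the last two.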
We assign one or more types to mixed strings in the following way. A type is of the form
$A \to B$, where $A$ and $B$ are subsets of $\mathcal{B}$. Intuitively, a mixed string has type $A \to B$ if the first
element of the string has base $A$, and it can be concatenated with an element with base $B$. It will be
the case that a mixed string of type $A \to B$ can be concatenated with a mixed string of type $B \to C$
to obtain a mixed string of type $A \to C$.

The mixed string $\epsilon$ has many types, namely it has type $A \to A$, for all $A \in \wp(\mathcal{B})$. A mixed
string of length 1 consisting of a single test $t$ has type $base(t) \cup A \to A$, for any $A \in \wp(\mathcal{B})$ such
that $A \cap base(t) = \emptyset$. A mixed string of length 1 consisting of a single program $p$ has type $\emptyset \to \mathcal{B}$.
A mixed string $a_1 \ldots a_n$ of length $n > 1$ has type $base(a_1) \to \mathcal{B} \setminus base(a_n)$.

Example 2.4: Let $\mathcal{P} = \{p, q\}$, and $\mathcal{B} = \{b, c, d\}$. The mixed string $p\{b, c, d\}$ has type $\emptyset \to \emptyset$.
The mixed string $\{d\}p$ has type $\{d\} \to \mathcal{B}$. The mixed string $\{b\}p\{b, c, d\}q\{b, c\}$ has type
$\{b\} \to \{d\}$. The concatenation of $\{b\}p\{b, c, d\}q\{b, c\}$ and $\{d\}p$, namely $\{b\}p\{b, c, d\}q\{b, c, d\}p$,
has type $\{b\} \to \mathcal{B}$. ∎
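Since a mixed string may have several types, it is convenient to phrase typing as a check of a candidate type rather than as a computation. The sketch below is illustrative only and assumes a hypothetical encoding: programs are strings, tests are `frozenset`s of literal names (with `~` marking negation), and the input is assumed to already be a valid mixed string.

```python
def base(a):
    """Base of an element: empty for a program, the primitive tests of a test otherwise."""
    return frozenset() if isinstance(a, str) else frozenset(x.lstrip("~") for x in a)

def has_type(s, A, C, B):
    """True iff the mixed string s (a tuple) has type A -> C over the primitive tests B."""
    A, C, B = frozenset(A), frozenset(C), frozenset(B)
    if len(s) == 0:
        return A == C                         # the empty string has type A -> A
    if len(s) == 1:
        a = s[0]
        if isinstance(a, str):
            return A == frozenset() and C == B    # a single program: empty -> B
        # a single test t: base(t) | C -> C, provided C and base(t) are disjoint
        return A == base(a) | C and not (C & base(a))
    # length > 1: base(a_1) -> B \ base(a_n)
    return A == base(s[0]) and C == B - base(s[-1])
```

The typings listed in Example 2.4 all pass this check.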

A mixed language is a set of mixed strings, and is typeable, with type $A \to B$, if all of the
mixed strings it contains have type $A \to B$. In this paper, we will only be concerned with typeable
mixed languages.

We will be interested in different operations on mixed languages in the following sections. When
$L_1$, $L_2$, and $L$ are mixed languages, we use the notation $L_1 \cdot L_2$ to denote the set
$\{\sigma_1 \cdot \sigma_2 : \sigma_1 \in L_1, \sigma_2 \in L_2\}$, $L^0$ to denote the set $\{\epsilon\}$, and for $n \geq 1$, $L^n$ to denote the set
$L \cdot L^{n-1}$. The following two operations will be useful in Section 5. The operator $T$, defined by

$$T(L) = \{\sigma : \sigma \in L, |\sigma| = 1, \sigma \text{ is a test}\}$$

extracts from a language all the mixed strings made up of a single test. The operator $\epsilon$, defined by

$$\epsilon(L) = L \cap \{\epsilon\}$$

essentially checks if the empty mixed string $\epsilon$ is in $L$, since $\epsilon(L)$ is nonempty if and only if the
empty mixed string is in $L$.

3 Mixed Automata 

Having introduced a notion of mixed strings, we now define a class of deterministic automata that 
can accept mixed strings. Mixed strings enforce a strict alternation between programs and tests, 
and this alternation is reflected in our automata. The transitions of the automata are labelled with 
primitive programs and literals. Given a mixed string, a mixed automaton can process the tests in the
string in many different orders; this reflects the fact that the tests that appear in mixed strings are 
sets of literals. 

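A mixed automaton over the set of primitive programs $\mathcal{P}$ and set of primitive tests $\mathcal{B}$ is a 3-tuple
$M = ((S_A)_{A \in \wp(\mathcal{B})}, o, (\delta_A)_{A \in \wp(\mathcal{B})})$, consisting of a set $S_A$ of states for each possible base
$A \neq \emptyset$ of a test as well as a set $S_\emptyset$ of program states, an output function $o : S_\emptyset \to \{0, 1\}$, and
transition functions $\delta_\emptyset : S_\emptyset \times \mathcal{P} \to S_{\mathcal{B}}$ and (for $A \neq \emptyset$)
$\delta_A : S_A \times lit(A) \to \bigcup_{A \in \wp(\mathcal{B})} S_A$, subject to the following two conditions:

A1. $\delta_A(s, l) \in S_{A \setminus \{base(l)\}}$, and

A2. for every state $s$ in $S_A$, for every test $t$ with base $A$, and for any two orderings $\langle x_1, \ldots, x_m \rangle$,
$\langle y_1, \ldots, y_m \rangle$ of the literals in $t$, if $s \xrightarrow{x_1} \cdots \xrightarrow{x_m} s_1$ and $s \xrightarrow{y_1} \cdots \xrightarrow{y_m} s_2$ then $s_1 = s_2$.

(For convenience, we write $s \xrightarrow{l} s'$ if $\delta_A(s, l) = s'$ for $A$ the base of $s$.)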

We give an example of a mixed automaton in Example 3.2. Intuitively, a state in $S_A$ can process
a mixed string of type $A \to B$, for some $B$. Condition A1 enforces the invariant that, as a string is
being processed, the current state is in $S_A$, for $A$ the base of the first element of the string. Condition
A2 is a form of "path independence": regardless of the order in which we process the literals of a 
test, we end up in the same program state. Condition A2, and basing transitions on literals rather 
than tests, allow the manipulation of mixed expressions at a finer level of granularity. This is related 
to a similar choice we made when allowing mixed strings to start with a test that is not "complete". 
This flexibility will be useful when we analyze mixed expressions in Section 5. 

The accepting states are defined via the output function o(s), viewed as a characteristic function. 
Accepting states are in S0. 

As in the coalgebraic treatment of automata [Rutten 1998], and contrary to standard definitions, 
we allow both the state spaces $S_A$ and the set $\mathcal{P}$ of primitive programs to be infinite. We also do not
force mixed automata to have initial states, for reasons that will become clear. 

We now define the mixed language accepted by a state of a mixed automaton. Call a sequence
$\mu = e_1 \ldots e_m$ of primitive programs and literals a linearization of a mixed string $\sigma = a_1 \ldots a_n$ if
$\mu$ can be obtained from $\sigma$ by replacing each test $a_i$ in $\sigma$ with a sequence of length $|a_i|$ containing
exactly the literals in $a_i$.

Example 3.1: Let $\mathcal{P} = \{p, q\}$, and $\mathcal{B} = \{b, c\}$. The mixed string $\{b\}p\{b, c\}q\{b, c\}$ (of type
$\{b\} \to \emptyset$) has four linearizations: $bpbcqbc$, $bpcbqbc$, $bpbcqcb$, and $bpcbqcb$. ∎
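Linearizations are easy to enumerate: each test independently contributes one permutation of its literals. The following sketch is illustrative only and assumes a hypothetical encoding in which programs are strings and tests are `frozenset`s of literal names.

```python
from itertools import permutations, product

def linearizations(s):
    """All linearizations of the mixed string s, each returned as a tuple of symbols."""
    choices = [
        [(a,)] if isinstance(a, str)          # a program contributes itself
        else list(permutations(sorted(a)))    # a test contributes every ordering of its literals
        for a in s
    ]
    # pick one ordering per element and flatten the picks into a single sequence
    return [sum(pick, ()) for pick in product(*choices)]
```

For the mixed string of Example 3.1 this produces exactly the four sequences listed there.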

Intuitively, a mixed string $\sigma$ is accepted by an automaton if a linearization of $\sigma$ is accepted by
the automaton according to the usual definition. Formally, a mixed string $\sigma$ is accepted by a state $s$
of an automaton $M$ if either

(1) $\sigma$ is $\epsilon$ and $s$ is a program state with $o(s) = 1$ (i.e., $s$ is an accepting program state), or

(2) there exists a linearization $e_1 \ldots e_m$ of $\sigma$ such that $s \xrightarrow{e_1} \cdots \xrightarrow{e_m} s'$, $s'$ is a program state,
and $o(s') = 1$.

If $\sigma$ is accepted (by a state $s$) in virtue of satisfying the second criterion, then every linearization is
a witness to this fact; in other words, the existential quantification in the second criterion could
be replaced with a universal quantification (over all linearizations of $\sigma$) without any change in the
actual definition. This is because of condition A2 in the definition of a mixed automaton.

We define the mixed language accepted by state $s$ of automaton $M$, written $L_M(s)$, as the set
of mixed strings accepted by state $s$ of $M$. It is easy to verify that all the strings accepted by a state
have the same type, namely, if $s$ is in $S_A$, then every string in $L_M(s)$ has type $A \to \emptyset$, and hence
$L_M(s)$ has type $A \to \emptyset$.
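Acceptance can be sketched by running a single linearization, which suffices by condition A2. The code below is illustrative only: the dictionary encoding, the state names, and the small automaton are all hypothetical and are not the automaton of any example in this paper. Programs are strings, tests are `frozenset`s of literal names, and `~b` encodes the negation of `b`.

```python
def accepts(delta, out, state, string):
    """Run `string` from `state`, reading each test one literal at a time (sorted order)."""
    for a in string:
        letters = [a] if isinstance(a, str) else sorted(a)
        for x in letters:
            state = delta[(state, x)]
    # only program states carry an output; ending anywhere else rejects
    return out.get(state, 0) == 1

# Hypothetical automaton over the programs {"p"} and the single primitive test "b".
delta = {
    ("s", "p"): "t",            # program state s moves to a state of base {b} on p
    ("t", "b"): "s",            # reading the literal b returns to the program state
    ("t", "~b"): "dead",        # reading the negated literal leads to a rejecting sink
    ("dead", "p"): "dead_b",
    ("dead_b", "b"): "dead",
    ("dead_b", "~b"): "dead",
}
out = {"s": 1, "dead": 0}       # output function, defined on program states only
```

In this toy automaton, state `s` accepts the empty string and `p{b}`, but rejects `p` alone (the run ends in a non-program state) and `p{~b}`.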

[Figure 1: A mixed automaton]

Example 3.2: Let $\mathcal{P} = \{p, q\}$, and $\mathcal{B} = \{b, c\}$. Consider the mixed automaton over $\mathcal{P}$ and $\mathcal{B}$
pictured in Figure 1, given by $M = ((S_A)_{A \in \wp(\mathcal{B})}, o, (\delta_A)_{A \in \wp(\mathcal{B})})$, where:

$$S_{\{b,c\}} = \{s_{2,\{b,c\}}, s_{sink,\{b,c\}}\}$$
$$S_{\{b\}} = \{s_{1,\{b\}}, s_{2,\{b\}}, s_{sink,\{b\}}\}$$
$$S_{\{c\}} = \{s_{2,\{c\}}, s_{sink,\{c\}}\}$$
$$S_\emptyset = \{s_{1,\emptyset}, s_{2,\emptyset}, s_{sink,\emptyset}\}$$

and

$$o(s_{1,\emptyset}) = 1, \qquad o(s_{2,\emptyset}) = 1, \qquad o(s_{sink,\emptyset}) = 0.$$

The transition function $\delta_A$ can be read off from Figure 1; note that the sink states $s_{sink,A}$ as well as the
transitions to the sink states are not pictured. Intuitively, any transition not pictured in the automaton
can be understood as going to the appropriate sink state. For instance, we have
$\delta_{\{b,c\}}(s_{2,\{b,c\}}, c) = s_{sink,\{b\}}$. We can check that the two conditions A1 and A2 hold in $M$. The language
accepted by state $s_{1,\{b\}}$ is $L_M(s_{1,\{b\}}) = \{\{b\}, \{b\}p\{b, c\}\}$. The language accepted by state $s_{1,\emptyset}$ is
$L_M(s_{1,\emptyset}) = \{\epsilon, p\{b, c\}\}$. ∎

We define a homomorphism between mixed automata $M$ and $M'$ to be a family $f = (f_A)_{A \in \wp(\mathcal{B})}$
of functions $f_A : S_A \to S'_A$ such that:

(1) for all $s \in S_\emptyset$, $o(s) = o'(f_\emptyset(s))$, and for all $p \in \mathcal{P}$, $f_{\mathcal{B}}(\delta_\emptyset(s, p)) = \delta'_\emptyset(f_\emptyset(s), p)$,

(2) for all $s \in S_A$ (where $A \neq \emptyset$) and all $l \in lit(A)$, $f_{A \setminus \{base(l)\}}(\delta_A(s, l)) = \delta'_A(f_A(s), l)$.

A homomorphism preserves accepting states and transitions. We write $f : M \to M'$ when $f$ is a
homomorphism between automata $M$ and $M'$. For convenience, we often write $f(s)$ for $f_A(s)$ when
the type $A$ of $s$ is understood. It is straightforward to verify that mixed automata form a category
(denoted $\mathcal{MA}$), where the morphisms of the category are mixed automata homomorphisms.

We are interested in identifying states that have the same behaviour, that is, that accept the same
mixed language. A bisimulation between two mixed automata $M = ((S_A)_{A \in \wp(\mathcal{B})}, o, (\delta_A)_{A \in \wp(\mathcal{B})})$
and $M' = ((S'_A)_{A \in \wp(\mathcal{B})}, o', (\delta'_A)_{A \in \wp(\mathcal{B})})$ is a family of relations $(R_A)_{A \in \wp(\mathcal{B})}$ where $R_A \subseteq S_A \times S'_A$
such that the following two conditions hold:

(1) for all $s \in S_\emptyset$ and $s' \in S'_\emptyset$, if $s R_\emptyset s'$, then $o(s) = o'(s')$ and for all $p \in \mathcal{P}$,
$\delta_\emptyset(s, p) \, R_{\mathcal{B}} \, \delta'_\emptyset(s', p)$, and

(2) for all $s \in S_A$ and $s' \in S'_A$ (where $A \neq \emptyset$), if $s R_A s'$, then for all $l \in lit(A)$,
$\delta_A(s, l) \, R_{A \setminus \{base(l)\}} \, \delta'_A(s', l)$.

A bisimulation between $M$ and itself is called a bisimulation on $M$. Two states $s$ and $s'$ of $M$
having the same type $B$ are said to be bisimilar, denoted by $s \sim_M s'$, if there exists a bisimulation
$(R_A)_{A \in \wp(\mathcal{B})}$ such that $s R_B s'$. (We simply write $s \sim s'$ when $M$ is clear from the context.) For each
$M$, the relation $\sim_M$ is the union of all bisimulations on $M$, and in fact is the greatest bisimulation
on $M$.

Proposition 3.3: If $s$ is a state of $M$ and $s'$ is a state of $M'$ with $s \sim s'$, then $L_M(s) = L_{M'}(s')$.

Proof: We show, by induction on the length of mixed strings, that for all mixed strings $\sigma$, and for all
states $s, s'$ such that $s \sim s'$, $\sigma \in L_M(s)$ if and only if $\sigma \in L_{M'}(s')$. For the empty mixed string
$\epsilon$, we have $\epsilon \in L_M(s)$ if and only if $o(s) = 1$ if and only if $o'(s') = 1$ (by definition of bisimilarity)
if and only if $\epsilon \in L_{M'}(s')$. Assume inductively that the result holds for mixed strings of length $n$.
Let $\sigma$ be a mixed string of length $n + 1$, of the form $a\sigma'$. Assume $\sigma \in L_M(s)$. By definition, there
is a linearization $e_1 \ldots e_m$ of $a$ and a state $s_1$ such that $s \xrightarrow{e_1} \cdots \xrightarrow{e_m} s_1$ and $\sigma' \in L_M(s_1)$. By the
definition of bisimilar states, we have $s' \xrightarrow{e_1} \cdots \xrightarrow{e_m} s'_1$ and $s_1 \sim s'_1$. By the induction hypothesis,
$\sigma' \in L_{M'}(s'_1)$. By the choice of $s'_1$, we have that $\sigma \in L_{M'}(s')$, as desired. The converse direction is
symmetric. ∎

Conditions (1) and (2) of the definition of a bisimulation are analogous to the conditions in the 
definition of a homomorphism. Indeed, a homomorphism can be viewed as a bisimulation. 

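Proposition 3.4: If $f : M \to M'$ is a mixed automaton homomorphism, then $(R_A)_{A \in \wp(\mathcal{B})}$, defined
by $R_A = \{(s, f_A(s)) : s \in S_A\}$, is a bisimulation.

Proof: First, for all $s \in S_\emptyset$, $s R_\emptyset s'$ implies $s' = f_\emptyset(s)$, and $o(s) = o'(f_\emptyset(s)) = o'(s')$. Moreover,
for all $p \in \mathcal{P}$, we have $\delta'_\emptyset(s', p) = \delta'_\emptyset(f_\emptyset(s), p) = f_{\mathcal{B}}(\delta_\emptyset(s, p))$, so that
$\delta_\emptyset(s, p) \, R_{\mathcal{B}} \, \delta'_\emptyset(s', p)$, as required. Similarly, let $s \in S_A$ (where $A \neq \emptyset$); $s R_A s'$ implies
$s' = f_A(s)$, and thus for all $l \in lit(A)$, $\delta'_A(s', l) = \delta'_A(f_A(s), l) = f_{A \setminus \{base(l)\}}(\delta_A(s, l))$, so that
$\delta_A(s, l) \, R_{A \setminus \{base(l)\}} \, \delta'_A(s', l)$, as required, proving that $(R_A)_{A \in \wp(\mathcal{B})}$ is a bisimulation. ∎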

An immediate consequence of this relationship is that homomorphisms preserve accepted lan- 
guages. 
Proposition 3.5: If $f : M \to M'$ is a mixed automaton homomorphism, then $L_M(s) = L_{M'}(f(s))$
for all states $s$ of $M$.

Proof: Immediate from Propositions 3.4 and 3.3. ∎

It turns out that we can impose a mixed automaton structure on the set of all mixed languages
with type $A \to \emptyset$. We take as states mixed languages of type $A \to \emptyset$. A state is accepting if the
empty string $\epsilon$ is in the language. It remains to define the transitions between states; we adapt the
idea of Brzozowski derivatives [Brzozowski 1964]. Our definition of derivative depends on whether
we are taking the derivative with respect to a program element or a literal.

If the mixed language $L$ has type $\emptyset \to B$ and $p \in \mathcal{P}$ is a primitive program, define

$$D_p(L) = \{\sigma : p \cdot \sigma \in L\}.$$

If the mixed language $L$ has type $A \to B$ (for $A \neq \emptyset$) and $l \in lit(A)$ is a literal, then

$$D_l(L) = \{\sigma : \{l\} \cdot \sigma \in L\}.$$

Define $\mathcal{L}_A$ to be the set of mixed languages of type $A \to \emptyset$. Define $\mathcal{L}$ to be
$((\mathcal{L}_A)_{A \in \wp(\mathcal{B})}, o_{\mathcal{L}}, (\delta_A)_{A \in \wp(\mathcal{B})})$, where $o_{\mathcal{L}}(L) = 1$ if $\epsilon \in L$, and 0 otherwise;
$\delta_\emptyset(L, p) = D_p(L)$; and $\delta_A(L, l) = D_l(L)$, for $A \neq \emptyset$ and $l \in lit(A)$. It is easy to verify that $\mathcal{L}$ is
indeed a mixed automaton. The following properties of $\mathcal{L}$ are significant.
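For finite languages, the two derivatives can be computed directly from their definitions. The sketch below is illustrative only and assumes a hypothetical encoding: a language is a Python set of tuples, programs are strings, and tests are `frozenset`s of literal names. Removing a literal from a leading test, or dropping that test when it becomes empty, yields exactly the strings $\sigma$ with $\{l\} \cdot \sigma$ in the language.

```python
def D_p(L, p):
    """Derivative with respect to a primitive program p: { s : p . s in L }."""
    return {s[1:] for s in L if s and s[0] == p}

def D_l(L, l):
    """Derivative with respect to a literal l: { s : {l} . s in L }."""
    out = set()
    for s in L:
        if s and isinstance(s[0], frozenset) and l in s[0]:
            rest = s[0] - {l}
            # keep the remaining literals of the first test, if any are left
            out.add(((rest,) if rest else ()) + s[1:])
    return out
```

For instance, starting from the language $\{\epsilon, p\{b, c\}\}$, taking `D_p` and then the two literal derivatives in either order reaches the language $\{\epsilon\}$, which illustrates the path independence required by condition A2.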

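Proposition 3.6: For a mixed automaton $M$ with states $(S_A)_{A \in \wp(\mathcal{B})}$, the maps $f_A : S_A \to \mathcal{L}_A$
mapping a state $s$ in $S_A$ to the language $L_M(s)$ form a mixed automaton homomorphism.

Proof: We check the two conditions for the family $(f_A)_{A \in \wp(\mathcal{B})}$ to be a homomorphism. First, given
$s \in S_\emptyset$, $o(s) = 1$ if and only if $\epsilon \in L_M(s)$, which is equivalent to $o_{\mathcal{L}}(f_\emptyset(s)) = 1$. Moreover, given
$p \in \mathcal{P}$, $f_{\mathcal{B}}(\delta_\emptyset(s, p)) = L_M(\delta_\emptyset(s, p)) = \{\sigma : p \cdot \sigma \in L_M(s)\} = D_p(L_M(s)) = D_p(f_\emptyset(s))$,
as required. Similarly, given $s \in S_A$ (where $A \neq \emptyset$), and $l \in lit(A)$,
$f_{A \setminus \{base(l)\}}(\delta_A(s, l)) = L_M(\delta_A(s, l)) = \{\sigma : \{l\} \cdot \sigma \in L_M(s)\} = D_l(L_M(s)) = D_l(f_A(s))$, as required. ∎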

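Proposition 3.7: For any mixed language $L$ in $\mathcal{L}$, the mixed language accepted by state $L$ in $\mathcal{L}$ is $L$
itself, that is, $L_{\mathcal{L}}(L) = L$.

Proof: We prove by induction on the length of linearizations of $\sigma$ that for all mixed strings $\sigma$,
$\sigma \in L$ if and only if $\sigma \in L_{\mathcal{L}}(L)$. For the empty mixed string $\epsilon$, we have
$\epsilon \in L \Leftrightarrow o_{\mathcal{L}}(L) = 1 \Leftrightarrow \epsilon \in L_{\mathcal{L}}(L)$. For $\sigma$ of the form $p\sigma'$, we have $\sigma = p \cdot \sigma'$, and thus we have
$p \cdot \sigma' \in L \Leftrightarrow \sigma' \in D_p(L)$, which by the induction hypothesis holds if and only if
$\sigma' \in L_{\mathcal{L}}(D_p(L)) \Leftrightarrow \sigma' \in D_p(L_{\mathcal{L}}(L))$ (because $L_{\mathcal{L}}$ is a mixed automaton homomorphism from $\mathcal{L}$ to
$\mathcal{L}$), which is just equivalent to $p \cdot \sigma' \in L_{\mathcal{L}}(L)$. For $\sigma$ with a linearization $l e_1 \ldots e_m$, letting $\sigma'$ denote
a string with linearization $e_1 \ldots e_m$, we have $\sigma = \{l\} \cdot \sigma'$, and we can derive in an exactly similar
manner that $\{l\} \cdot \sigma' \in L \Leftrightarrow \sigma' \in D_l(L) \Leftrightarrow \sigma' \in L_{\mathcal{L}}(D_l(L)) \Leftrightarrow \sigma' \in D_l(L_{\mathcal{L}}(L)) \Leftrightarrow
\{l\} \cdot \sigma' \in L_{\mathcal{L}}(L) \Leftrightarrow \sigma \in L_{\mathcal{L}}(L)$. ∎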

These facts combine into the following fundamental property of $\mathcal{L}$, namely, that $\mathcal{L}$ is a final
automaton.

Theorem 3.8: $\mathcal{L}$ is final in the category $\mathcal{MA}$, that is, for every mixed automaton $M$, there is a
unique homomorphism from $M$ to $\mathcal{L}$.

Proof: Let $M$ be a mixed automaton. By Proposition 3.6, there exists a homomorphism $f$ from $M$
to the final automaton $\mathcal{L}$, mapping a state $s$ to the language $L_M(s)$ accepted by that state. Let $f'$ be
another homomorphism from $M$ to $\mathcal{L}$. To establish uniqueness, we need to show that for any state
$s$ of $M$, we have $f(s) = f'(s)$:

$$\begin{aligned}
f(s) &= L_M(s) && \text{(by definition of } f\text{)} \\
     &= L_{\mathcal{L}}(f'(s)) && \text{(by Proposition 3.5)} \\
     &= f'(s) && \text{(by Proposition 3.7).}
\end{aligned}$$

Hence, $f$ is the required unique homomorphism. ∎

The finality of $\mathcal{L}$ gives rise to the following coinduction proof principle for language equality,
in a way which is by now standard [Rutten 2000].

Corollary 3.9: For two mixed languages $K$ and $L$ of type $A \to \emptyset$, if $K \sim L$ then $K = L$.

In other words, to establish the equality of two mixed languages, it is sufficient to exhibit a
bisimulation between the two languages when viewed as states of the final automaton $\mathcal{L}$. In the
following sections, we will use this principle to analyze equality of languages described by a typed
form of KAT expressions.

4 Pseudo-Bisimulations 

The "path independence" condition (A2) in the definition of a mixed automaton gives mixed au- 
tomata a certain form of redundancy. It turns out that due to this redundancy, we can define a 
simpler notion than bisimulation that still lets us establish the bisimilarity of states. 

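A pseudo-bisimulation (relative to the ordering $b_1, \ldots, b_{|\mathcal{B}|}$ of the primitive tests in $\mathcal{B}$) between
two mixed automata $M = ((S_A)_{A \in \wp(\mathcal{B})}, o, (\delta_A)_{A \in \wp(\mathcal{B})})$ and $M' = ((S'_A)_{A \in \wp(\mathcal{B})}, o', (\delta'_A)_{A \in \wp(\mathcal{B})})$
is a family of relations $(R_i)_{i = 0, \ldots, |\mathcal{B}|}$ where $R_i \subseteq S_{A_i} \times S'_{A_i}$ (with $A_i$ denoting
$\{b_j : j \leq i, j \in \{1, \ldots, |\mathcal{B}|\}\}$) such that the following two conditions hold:

(1) for all $s \in S_\emptyset$ and $s' \in S'_\emptyset$, if $s R_0 s'$, then $o(s) = o'(s')$ and for all $p \in \mathcal{P}$,
$\delta_\emptyset(s, p) \, R_{|\mathcal{B}|} \, \delta'_\emptyset(s', p)$, and

(2) for all $i = 1, \ldots, |\mathcal{B}|$, for all $s \in S_{A_i}$ and $s' \in S'_{A_i}$, if $s R_i s'$, then for all $l \in lit(b_i)$,
$\delta_{A_i}(s, l) \, R_{i-1} \, \delta'_{A_i}(s', l)$.

The sense in which a pseudo-bisimulation is weaker than a bisimulation is that there need not
be a relation for each element of $\wp(\mathcal{B})$. As the following theorem shows, however, we can always
complete a pseudo-bisimulation to a bisimulation.

Theorem 4.1: If $(R_i)_{i = 0, \ldots, |\mathcal{B}|}$ is a pseudo-bisimulation (relative to the ordering $b_1, \ldots, b_{|\mathcal{B}|}$ of the
primitive tests in $\mathcal{B}$), then there exists a bisimulation $(R'_A)$ such that $R'_{A_i} = R_i$ for all $i = 0, \ldots, |\mathcal{B}|$
(with $A_i$ denoting $\{b_j : j \leq i, j \in \{1, \ldots, |\mathcal{B}|\}\}$).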
Proof: Let $(R_i)_{i = 0, \ldots, |\mathcal{B}|}$ be a pseudo-bisimulation (relative to the ordering on primitive tests
$b_1, \ldots, b_{|\mathcal{B}|}$). We define a family of relations $R'_A \subseteq S_A \times S'_A$ for each $A \in \wp(\mathcal{B})$, and show
that it forms a bisimulation with the required property. The proof relies on the path independence
condition A2 of mixed automata in a fundamental way. Given $A \in \wp(\mathcal{B})$, let $i(A)$ be the largest
$i \in \{1, \ldots, |\mathcal{B}|\}$ such that $\{b_1, \ldots, b_i\} \subseteq A$, and let $c(A)$ be the relative complement of
$\{b_1, \ldots, b_{i(A)}\}$ defined by $A \setminus \{b_1, \ldots, b_{i(A)}\}$. We say that a sequence of literals $l_1, \ldots, l_k$ is
exhaustive over a set of bases $A$ if $A = \{base(l_1), \ldots, base(l_k)\}$ and $|A| = k$. Define $R'_A$ as follows:
$s R'_A s'$ holds if and only if for all literal sequences $l_1, \ldots, l_k$ exhaustive over $c(A)$, we have
$s \xrightarrow{l_1} \cdots \xrightarrow{l_k} s_1$, $s' \xrightarrow{l_1} \cdots \xrightarrow{l_k} s'_1$, and $s_1 R_{i(A)} s'_1$. Clearly, if $A = \{b_1, \ldots, b_{i(A)}\}$, then
$R'_A = R_{i(A)}$, as required. We now check that $(R'_A)_{A \in \wp(\mathcal{B})}$ is a bisimulation. Clearly, since
$R'_\emptyset = R_0$, if $s R'_\emptyset s'$, then $s R_0 s'$, and hence $o(s) = o'(s')$, and for all $p \in \mathcal{P}$, it holds that
$\delta_\emptyset(s, p) \, R_{|\mathcal{B}|} \, \delta'_\emptyset(s', p)$, implying $\delta_\emptyset(s, p) \, R'_{\mathcal{B}} \, \delta'_\emptyset(s', p)$. Now, let $A \neq \emptyset$, $s \in S_A$, $s' \in S'_A$,
$l \in lit(A)$, and assume $s R'_A s'$. Consider the following cases:

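Case $A = \{b_1, \ldots, b_{i(A)}\}$, $base(l) = b_{i(A)}$: Since $s R'_A s'$, then $s R_{i(A)} s'$, and by the properties
of pseudo-bisimulations, we have $\delta_A(s, l) \, R_{i(A)-1} \, \delta'_A(s', l)$, which is exactly
$\delta_A(s, l) \, R'_{A \setminus \{base(l)\}} \, \delta'_A(s', l)$.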

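Case $A = \{b_1, \ldots, b_{i(A)}\}$, $base(l) = b_j$, $j < i(A)$: Since $s R'_A s'$, then $s R_{i(A)} s'$. Let $l_1, \ldots, l_k$
be an arbitrary exhaustive sequence of literals over $\{b_{i(A)}, \ldots, b_{j+1}\}$. Let $l'_{i(A)}, \ldots, l'_{j+1}$ be the
arrangement of $l_1, \ldots, l_k$ such that $base(l'_m) = b_m$. Consider the states $s_1, s_2, s'_1, s'_2$ such that
$s \xrightarrow{l'_{i(A)}} \cdots \xrightarrow{l'_{j+1}} s_1 \xrightarrow{l} s_2$, and $s' \xrightarrow{l'_{i(A)}} \cdots \xrightarrow{l'_{j+1}} s'_1 \xrightarrow{l} s'_2$. By the definition of
pseudo-bisimulation, we have that $s_2 R_{j-1} s'_2$. Now, by condition A2, we have states $s_3, s'_3$ such that
$s \xrightarrow{l} s_3 \xrightarrow{l'_{i(A)}} \cdots \xrightarrow{l'_{j+1}} s_2$ and $s' \xrightarrow{l} s'_3 \xrightarrow{l'_{i(A)}} \cdots \xrightarrow{l'_{j+1}} s'_2$. By condition A2 again, we
have that $s_3 \xrightarrow{l_1} \cdots \xrightarrow{l_k} s_2$ and $s'_3 \xrightarrow{l_1} \cdots \xrightarrow{l_k} s'_2$. Since $l_1, \ldots, l_k$ was arbitrary, $s_2 R_{j-1} s'_2$ and
$i(A \setminus \{base(l)\}) = j - 1$, we have $s_3 R'_{A \setminus \{base(l)\}} s'_3$, that is, $\delta_A(s, l) \, R'_{A \setminus \{base(l)\}} \, \delta'_A(s', l)$.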

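Case $A \supset \{b_1, \ldots, b_{i(A)}\}$, $base(l) \in c(A)$: Pick an arbitrary sequence $l_1, \ldots, l_k$ of literals
that is exhaustive over $c(A \setminus \{base(l)\})$, and states $s_1, s_2, s'_1, s'_2$ such that
$s \xrightarrow{l} s_2 \xrightarrow{l_1} \cdots \xrightarrow{l_k} s_1$, and $s' \xrightarrow{l} s'_2 \xrightarrow{l_1} \cdots \xrightarrow{l_k} s'_1$. By definition of $R'_A$, we have
$s_1 R_{i(A)} s'_1$. Since the sequence of literals $l_1, \ldots, l_k$ was arbitrary, and since
$i(A) = i(A \setminus \{base(l)\})$, we have that $s_2 R'_{A \setminus \{base(l)\}} s'_2$, that is,
$\delta_A(s, l) \, R'_{A \setminus \{base(l)\}} \, \delta'_A(s', l)$.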

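Case $A \supset \{b_1, \ldots, b_{i(A)}\}$, $base(l) = b_{i(A)}$: Pick an arbitrary sequence $l_1, \ldots, l_k$ of literals
that is exhaustive over $c(A)$, and states $s_1, s'_1$ such that $s \xrightarrow{l_1} \cdots \xrightarrow{l_k} s_1$ and $s' \xrightarrow{l_1} \cdots \xrightarrow{l_k} s'_1$.
By definition of $R'_A$, we have $s_1 R_{i(A)} s'_1$. By definition of pseudo-bisimulation, if $s_1 \xrightarrow{l} s_2$ and
$s'_1 \xrightarrow{l} s'_2$, then we have $s_2 R_{i(A)-1} s'_2$. By condition A2, we have that for states $s_3, s'_3$,
$s \xrightarrow{l} s_3 \xrightarrow{l_1} \cdots \xrightarrow{l_k} s_2$ and $s' \xrightarrow{l} s'_3 \xrightarrow{l_1} \cdots \xrightarrow{l_k} s'_2$. Thus, since $l_1, \ldots, l_k$ was arbitrary, and
$i(A \setminus \{base(l)\}) = i(A) - 1$, we have $s_3 R'_{A \setminus \{base(l)\}} s'_3$, that is, $\delta_A(s, l) \, R'_{A \setminus \{base(l)\}} \, \delta'_A(s', l)$.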

Case $A \supset \{b_1, \ldots, b_{i(A)}\}$, $base(l) = b_j$, $j < i(A)$: Pick an arbitrary sequence $l_1, \ldots, l_k$ of
literals that is exhaustive over $c(A) \cup \{b_{i(A)}, \ldots, b_{j+1}\}$. Let $l'_1, \ldots, l'_{k'}$ be the elements of
$l_1, \ldots, l_k$ with bases in $c(A)$. Let $l''_1, \ldots, l''_{k''}$ be the elements of $l_1, \ldots, l_k$ with bases in
$\{b_{i(A)}, \ldots, b_{j+1}\}$. Let $l'''_{i(A)}, \ldots, l'''_{j+1}$ be the arrangement of $l''_1, \ldots, l''_{k''}$ such that
$base(l'''_m) = b_m$. Consider states $s_1, s'_1$ such that $s \xrightarrow{l'_1} \cdots \xrightarrow{l'_{k'}} s_1$ and
$s' \xrightarrow{l'_1} \cdots \xrightarrow{l'_{k'}} s'_1$. By definition of $R'_A$, we have $s_1 R_{i(A)} s'_1$. Now, consider states
$s_2, s_3, s'_2, s'_3$ such that $s_1 \xrightarrow{l'''_{i(A)}} \cdots \xrightarrow{l'''_{j+1}} s_2 \xrightarrow{l} s_3$ and
$s'_1 \xrightarrow{l'''_{i(A)}} \cdots \xrightarrow{l'''_{j+1}} s'_2 \xrightarrow{l} s'_3$. By the definition of pseudo-bisimulation, since
$s_1 R_{i(A)} s'_1$, we have that $s_3 R_{j-1} s'_3$. Now, by condition A2, we have states $s_4, s'_4$ such that
$s \xrightarrow{l} s_4 \xrightarrow{l_1} \cdots \xrightarrow{l_k} s_3$ and $s' \xrightarrow{l} s'_4 \xrightarrow{l_1} \cdots \xrightarrow{l_k} s'_3$. Since $l_1, \ldots, l_k$ was arbitrary,
and $i(A \setminus \{base(l)\}) = j - 1$, we have $s_4 R'_{A \setminus \{base(l)\}} s'_4$, that is,
$\delta_A(s, l) \, R'_{A \setminus \{base(l)\}} \, \delta'_A(s', l)$. ∎

Let us say that two states $s, s'$ are pseudo-bisimilar if they are related by some $R_i$ in a
pseudo-bisimulation $(R_i)$; it follows directly from Theorem 4.1 that pseudo-bisimilar states are bisimilar.

5 Mixed Expressions and Derivatives 

A mixed expression (over the set of primitive programs $\mathcal{P}$ and the set of primitive tests $\mathcal{B}$) is any
expression built via the following grammar:

$$e ::= 0 \mid 1 \mid p \mid l \mid e_1 + e_2 \mid e_1 \cdot e_2 \mid e^*$$

(with $p \in \mathcal{P}$ and $l \in lit(\mathcal{B})$). For simplicity, we often write $e_1 e_2$ for $e_1 \cdot e_2$. We also freely
use parentheses when appropriate. Intuitively, the constants 0 and 1 stand for failure and success,
respectively. The expression $p$ represents a primitive program, while $l$ represents a primitive test.
The operation $+$ is used for choice, $\cdot$ for sequencing, and $*$ for iteration. These are a subclass
of the KAT expressions as defined by Kozen [1997]. (In addition to allowing negated primitive
tests, Kozen also allows negated tests.) We call them mixed expressions to emphasize the different
interpretation we have in mind.

In a way similar to regular expressions denoting regular languages, we define a mapping M 
from mixed expressions to mixed languages inductively as follows: 

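$$\begin{aligned}
M(0) &= \emptyset \\
M(1) &= \{\epsilon\} \\
M(p) &= \{p\} \\
M(l) &= \{\{l\}\} \\
M(e_1 + e_2) &= M(e_1) \cup M(e_2) \\
M(e_1 \cdot e_2) &= M(e_1) \cdot M(e_2) \\
M(e^*) &= \bigcup_{n \geq 0} M(e)^n.
\end{aligned}$$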

The mapping M is a rather canonical homomorphism from mixed expressions to mixed lan- 
guages. (It is worth noting that we have not defined any axioms for deriving the "equivalence" of 
mixed expressions, and it is quite possible for distinct mixed expressions to give rise to the same 
mixed language.) 

Inspired by a type system devised by Kozen [1998, 2002] for KA and KAT expressions, we
impose a type system on mixed expressions. The types have the form $A \to B$, where $A, B \in \wp(\mathcal{B})$,
the same types we assigned to mixed strings in Section 2. We shall soon see that this is no accident.
We assign a type to a mixed expression via a type judgment written $\vdash e : A \to B$. The following
inference rules are used to derive the type of a mixed expression:

$$\vdash 0 : A \to B \qquad \vdash 1 : A \to A \qquad \vdash p : \emptyset \to \mathcal{B} \qquad \vdash l : A \cup \{base(l)\} \to A \setminus \{base(l)\}$$

$$\frac{\vdash e_1 : A \to B \quad \vdash e_2 : A \to B}{\vdash e_1 + e_2 : A \to B} \qquad
\frac{\vdash e_1 : A \to B \quad \vdash e_2 : B \to C}{\vdash e_1 \cdot e_2 : A \to C} \qquad
\frac{\vdash e : A \to A}{\vdash e^* : A \to A}$$

It is clear from these rules that any subexpression of a mixed expression having a type judgment 
also has a type judgment. 
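
As a rough illustration of the typing rules, the following Python sketch enumerates every type $A \to B$ that can be derived for a mixed expression over a small, finite set of primitive tests. The tuple encoding of expressions (`("lit", base, polarity)`, `("seq", e1, e2)`, and so on) and the names `powerset` and `types` are assumptions made for this sketch only; they are not notation from the paper.

```python
from itertools import combinations


def powerset(tests):
    """All subsets of the primitive tests, as frozensets."""
    items = sorted(tests)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def types(e, tests):
    """Set of pairs (A, B) such that |- e : A -> B is derivable.

    Expressions are nested tuples (an encoding assumed for this sketch):
    ("zero",), ("one",), ("prog", name), ("lit", base, polarity),
    ("sum", e1, e2), ("seq", e1, e2), ("star", e).
    """
    subsets = powerset(tests)
    tag = e[0]
    if tag == "zero":   # |- 0 : A -> B for all A, B
        return {(a, b) for a in subsets for b in subsets}
    if tag == "one":    # |- 1 : A -> A
        return {(a, a) for a in subsets}
    if tag == "prog":   # |- p : {} -> B
        return {(frozenset(), b) for b in subsets}
    if tag == "lit":    # |- l : A u {base(l)} -> A \ {base(l)}
        base = e[1]
        return {(a | {base}, a - {base}) for a in subsets}
    if tag == "sum":    # both summands must share the type
        return types(e[1], tests) & types(e[2], tests)
    if tag == "seq":    # the target of e1 must match the source of e2
        left, right = types(e[1], tests), types(e[2], tests)
        return {(a, c) for (a, b) in left for (b2, c) in right if b == b2}
    if tag == "star":   # only types of the form A -> A survive
        return {(a, b) for (a, b) in types(e[1], tests) if a == b}
    raise ValueError(f"unknown expression: {e!r}")
```

For example, with tests `{"b", "c"}` the expression `("seq", ("lit", "b", True), ("prog", "p"))` receives only types whose source is `{b}`, while `("seq", ("lit", "b", True), ("lit", "b", True))` receives no type at all, since the first literal removes `b` from the set of tests that may still be read.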

The typeable mixed expressions (which intuitively are the "well-formed" expressions) induce 
typeable mixed languages via the mapping M, as formalized by the following proposition. 

Proposition 5.1: If $\vdash e : A \to B$, then $M(e)$ is a mixed language of type $A \to B$.

Proof: A straightforward induction on the structure of mixed expressions. □

Our goal is to manipulate mixed languages by manipulating the mixed expressions that represent
them via the mapping $M$. (Of course, not every mixed language is in the image of $M$.) In particular,
we are interested in the operations $T(L)$ and $\epsilon(L)$, as defined in Section 2, as well as the language
derivatives $D_p$ and $D_l$ introduced in the last section.

We now define operators on mixed expressions that capture those operators on the languages
denoted by those mixed expressions. We define $\hat{T}$ inductively on the structure of mixed expressions,
as follows:

\begin{align*}
\hat{T}(0) &= 0 \\
\hat{T}(1) &= 1 \\
\hat{T}(p) &= 0 \\
\hat{T}(l) &= l \\
\hat{T}(e_1 + e_2) &= \hat{T}(e_1) + \hat{T}(e_2) \\
\hat{T}(e_1 \cdot e_2) &= \hat{T}(e_1) \cdot \hat{T}(e_2) \\
\hat{T}(e^*) &= \hat{T}(e)^*
\end{align*}

(where $p \in \mathcal{P}$ and $l \in \mathit{lit}(\mathcal{B})$). The operator $\hat{T}$ "models" the operator $T(L)$, as is made precise in
the following way.

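Proposition 5.2: If $\vdash e : A \to B$, then $\hat{T}(e)$ is a typeable mixed expression such that $T(M(e)) = M(\hat{T}(e))$.

Proof: A straightforward induction on the structure of mixed expressions. □

We define $\hat{\epsilon}$ inductively on the structure of mixed expressions, as follows:

\begin{align*}
\hat{\epsilon}(0) &= 0 \\
\hat{\epsilon}(1) &= 1 \\
\hat{\epsilon}(p) &= 0 \\
\hat{\epsilon}(l) &= 0 \\
\hat{\epsilon}(e_1 + e_2) &= \begin{cases} 0 & \text{if } \hat{\epsilon}(e_1) = \hat{\epsilon}(e_2) = 0 \\ 1 & \text{otherwise} \end{cases} \\
\hat{\epsilon}(e_1 \cdot e_2) &= \begin{cases} 1 & \text{if } \hat{\epsilon}(e_1) = \hat{\epsilon}(e_2) = 1 \\ 0 & \text{otherwise} \end{cases} \\
\hat{\epsilon}(e^*) &= 1
\end{align*}

(where $p \in \mathcal{P}$ and $l \in \mathit{lit}(\mathcal{B})$). Note that $\hat{\epsilon}(e)$ is always the mixed expression $0$ or $1$. In analogy to
Proposition 5.2, we have the following fact connecting the $\epsilon$ and $\hat{\epsilon}$ operators.

Proposition 5.3: If $\vdash e : A \to B$, then $\hat{\epsilon}(e)$ is a typeable mixed expression such that $\epsilon(M(e)) = M(\hat{\epsilon}(e))$.

Proof: A straightforward induction on the structure of mixed expressions. □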

Finally, we define, by induction on the structure of mixed expressions, the derivative operator
$\hat{D}$ for typeable mixed expressions. There are two forms of the derivative, corresponding to the two
forms of derivative for mixed languages: the derivative $\hat{D}_l$ with respect to a literal $l \in \mathit{lit}(\mathcal{B})$, and
the derivative $\hat{D}_p$ with respect to a primitive program $p \in \mathcal{P}$. The two forms of derivative are
defined similarly, except on the product of two expressions. (Strictly speaking, since the definition
of the derivative depends on the type of the expressions being differentiated, $\hat{D}$ should take type
derivations as arguments rather than simply expressions. To lighten the notation, we write $\hat{D}$ as
though it took mixed expressions as arguments, with the understanding that the appropriate types
are available.)

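The derivative $\hat{D}_p$ with respect to a primitive program $p \in \mathcal{P}$ is defined as follows:

\begin{align*}
\hat{D}_p(0) &= 0 \\
\hat{D}_p(1) &= 0 \\
\hat{D}_p(q) &= \begin{cases} 1 & \text{if } p = q \\ 0 & \text{otherwise} \end{cases} \\
\hat{D}_p(l) &= 0 \\
\hat{D}_p(e_1 + e_2) &= \hat{D}_p(e_1) + \hat{D}_p(e_2) \\
\hat{D}_p(e_1 \cdot e_2) &= \begin{cases} \hat{D}_p(e_1) \cdot e_2 & \text{if } B \neq \emptyset \\ \hat{D}_p(e_1) \cdot e_2 + \hat{\epsilon}(e_1) \cdot \hat{D}_p(e_2) & \text{otherwise} \end{cases} \\
&\qquad \text{where } \vdash e_1 : A \to B \text{ and } \vdash e_2 : B \to C \\
\hat{D}_p(e^*) &= \hat{D}_p(e) \cdot e^*.
\end{align*}

The derivative $\hat{D}_l$ with respect to a literal $l \in \mathit{lit}(\mathcal{B})$ is defined as follows:

\begin{align*}
\hat{D}_l(0) &= 0 \\
\hat{D}_l(1) &= 0 \\
\hat{D}_l(p) &= 0 \\
\hat{D}_l(l') &= \begin{cases} 1 & \text{if } l = l' \\ 0 & \text{otherwise} \end{cases} \\
\hat{D}_l(e_1 + e_2) &= \hat{D}_l(e_1) + \hat{D}_l(e_2) \\
\hat{D}_l(e_1 \cdot e_2) &= \begin{cases} \hat{D}_l(e_1) \cdot e_2 & \text{if } \mathit{base}(l) \notin B \\ \hat{D}_l(e_1) \cdot e_2 + \hat{T}(e_1) \cdot \hat{D}_l(e_2) & \text{otherwise} \end{cases} \\
&\qquad \text{where } \vdash e_1 : A \to B \text{ and } \vdash e_2 : B \to C \\
\hat{D}_l(e^*) &= \hat{D}_l(e) \cdot e^*.
\end{align*}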
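
The two derivative operators can be sketched in Python as follows. This is only an illustration under stated assumptions: expressions are nested tuples, each product node `("seq", e1, e2, mid)` carries the intermediate type $B$ of $\vdash e_1 : A \to B$, $\vdash e_2 : B \to C$ as a frozenset, and each star node `("star", e, a)` carries the set $A$ of its type $A \to A$. These annotations stand in for the type derivations that the definitions take as implicit arguments. The names `t_hat`, `eps_hat`, `d_prog`, and `d_lit` are hypothetical, and no simplification (such as rewriting $1 \cdot e$ to $e$) is performed.

```python
ZERO, ONE = ("zero",), ("one",)


def t_hat(e):
    """The operator T-hat: replace every primitive program by 0."""
    tag = e[0]
    if tag == "prog":
        return ZERO
    if tag == "sum":
        return ("sum", t_hat(e[1]), t_hat(e[2]))
    if tag == "seq":
        return ("seq", t_hat(e[1]), t_hat(e[2]), e[3])
    if tag == "star":
        return ("star", t_hat(e[1]), e[2])
    return e  # zero, one, and literals are unchanged


def eps_hat(e):
    """The operator epsilon-hat: always returns ZERO or ONE."""
    tag = e[0]
    if tag in ("one", "star"):
        return ONE
    if tag == "sum":
        return ONE if ONE in (eps_hat(e[1]), eps_hat(e[2])) else ZERO
    if tag == "seq":
        return ONE if eps_hat(e[1]) == eps_hat(e[2]) == ONE else ZERO
    return ZERO  # zero, programs, literals


def d_prog(p, e, all_tests):
    """Derivative with respect to the primitive program named p.

    all_tests is the full set of primitive tests: after a program runs,
    every test may be read again, so the new product node gets it as its
    intermediate type.
    """
    tag = e[0]
    if tag == "prog":
        return ONE if e[1] == p else ZERO
    if tag == "sum":
        return ("sum", d_prog(p, e[1], all_tests), d_prog(p, e[2], all_tests))
    if tag == "seq":
        _, e1, e2, mid = e
        first = ("seq", d_prog(p, e1, all_tests), e2, mid)
        if mid:  # B is nonempty: e2 cannot start with a program
            return first
        second = ("seq", eps_hat(e1), d_prog(p, e2, all_tests), all_tests)
        return ("sum", first, second)
    if tag == "star":
        return ("seq", d_prog(p, e[1], all_tests), e, e[2])
    return ZERO  # zero, one, literals


def d_lit(l, e):
    """Derivative with respect to the literal l = ("lit", base, polarity)."""
    base = l[1]
    tag = e[0]
    if tag == "lit":
        return ONE if e == l else ZERO
    if tag == "sum":
        return ("sum", d_lit(l, e[1]), d_lit(l, e[2]))
    if tag == "seq":
        _, e1, e2, mid = e
        first = ("seq", d_lit(l, e1), e2, mid)
        if base not in mid:  # base(l) not in B: l must be consumed by e1
            return first
        second = ("seq", t_hat(e1), d_lit(l, e2), mid - {base})
        return ("sum", first, second)
    if tag == "star":
        return ("seq", d_lit(l, e[1]), e, e[2])
    return ZERO  # zero, one, programs
```

For instance, with `b = ("lit", "b", True)` and `p = ("prog", "p")`, the call `d_lit(b, ("seq", b, p, frozenset()))` yields the unsimplified expression `("seq", ONE, p, frozenset())`, which denotes the same language as `p`.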

We have the following proposition, similar to the previous two, connecting the derivative $\hat{D}$ to
the previously defined derivative $D$ on mixed languages.

Proposition 5.4: Suppose that $\vdash e : A \to B$.

If $A = \emptyset$, then for all $p \in \mathcal{P}$, $D_p(M(e)) = M(\hat{D}_p(e))$.
If $A \neq \emptyset$, then for all $l \in \mathit{lit}(A)$, $D_l(M(e)) = M(\hat{D}_l(e))$.

Proof: The proof is by induction on the structure of the mixed expression $e$. To illustrate the proof
technique, we give one case of the proof.

Suppose that $\vdash e_1 : A \to B$ and $\vdash e_2 : B \to C$, and $e = e_1 \cdot e_2$. Suppose further that
$l \in \mathit{lit}(\mathcal{B})$ is a literal such that $\mathit{base}(l) \in A$ and $\mathit{base}(l) \in B$. We will show that the proposition
holds for the expression $e$, assuming (by the induction hypothesis) that the proposition holds for all
subexpressions of $e$.

We first establish three claims that will be needed.

Claim 1: If $t$ is a test which (as a mixed string) can be judged to have type $A \to B$, then
$\{t\} \cdot \{\sigma : \{l\} \cdot \sigma \in M(e_2)\} = \{\sigma' : \{l\} \cdot \sigma' \in \{t\} \cdot M(e_2)\}$.

First suppose that $\sigma$ is a mixed string such that $\{l\} \cdot \sigma \in M(e_2)$. Then $\sigma$ can be judged to
have type $B \setminus \{\mathit{base}(l)\} \to C$, and so $\{l\} \cdot \{t\} \cdot \sigma = \{t\} \cdot \{l\} \cdot \sigma \in \{t\} \cdot M(e_2)$. It follows that
$t \cdot \sigma \in \{\sigma' : \{l\} \cdot \sigma' \in \{t\} \cdot M(e_2)\}$. For the other direction, suppose that $\sigma'$ is a mixed string such
that $\{l\} \cdot \sigma' \in \{t\} \cdot M(e_2)$. Then there exists a mixed string $\tau \in M(e_2)$ such that $\{l\} \cdot \sigma' = \{t\} \cdot \tau$.
Since $t$ can be judged to have type $A \to B$ and $\mathit{base}(l) \in A \cap B$, $\mathit{base}(l) \notin t$ and there exists a
mixed string $\sigma$ such that $\{l\} \cdot \sigma' = \{t\} \cdot \tau = \{l\} \cdot \{t\} \cdot \sigma$. Thus $\sigma' = \{t\} \cdot \sigma$ where $\{l\} \cdot \sigma \in M(e_2)$.

Claim 2: If $\sigma$ is a mixed string such that $\{l\} \cdot \sigma \in M(e_1)$, then $\{l\} \cdot \sigma \in M(e_1) \setminus T(M(e_1))$.

This claim holds because $\{l\} \cdot \sigma \in M(e_1)$ implies that $\sigma$ has type $A \setminus \{\mathit{base}(l)\} \to B$; since
$B \not\subseteq A \setminus \{\mathit{base}(l)\}$, by the definition of the type of a mixed string, $|\sigma| > 1$ and so $|\{l\} \cdot \sigma| > 1$.
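Claim 3: $\{\sigma : \{l\} \cdot \sigma \in M(e_1) \setminus T(M(e_1))\} \cdot M(e_2) = \{\sigma : \{l\} \cdot \sigma \in (M(e_1) \setminus T(M(e_1))) \cdot M(e_2)\}$.

The $\subseteq$ direction is straightforward. For the $\supseteq$ direction, let $\sigma$ be a mixed string in the second
set; then, there exist strings $\tau_1 \in M(e_1) \setminus T(M(e_1))$ and $\tau_2 \in M(e_2)$ such that $\{l\} \cdot \sigma = \tau_1 \cdot \tau_2$.
All strings in $M(e_1)$ have type $A \to B$; since $\mathit{base}(l) \in B$, there are no strings in $M(e_1)$ of length
one consisting of a primitive program, and so $|\tau_1| \ge 3$. Hence $\sigma = \sigma' \cdot \tau_2$ for some mixed string $\sigma'$
such that $\{l\} \cdot \sigma' \in M(e_1) \setminus T(M(e_1))$.

Using these three claims, we show that $D_l(M(e)) = M(\hat{D}_l(e))$:

\begin{align*}
& M(\hat{D}_l(e_1 \cdot e_2)) \\
&= M(\hat{D}_l(e_1) \cdot e_2 + \hat{T}(e_1) \cdot \hat{D}_l(e_2)) && \text{(by definition of } \hat{D}_l\text{)} \\
&= M(\hat{D}_l(e_1)) \cdot M(e_2) \cup M(\hat{T}(e_1)) \cdot M(\hat{D}_l(e_2)) && \text{(by definition of } M\text{)} \\
&= D_l(M(e_1)) \cdot M(e_2) \cup M(\hat{T}(e_1)) \cdot D_l(M(e_2)) && \text{(by induction hypothesis)} \\
&= D_l(M(e_1)) \cdot M(e_2) \cup T(M(e_1)) \cdot D_l(M(e_2)) && \text{(by Proposition 5.2)} \\
&= \{\sigma : \{l\} \cdot \sigma \in M(e_1)\} \cdot M(e_2) \cup T(M(e_1)) \cdot \{\sigma : \{l\} \cdot \sigma \in M(e_2)\} && \text{(by definition of } D_l\text{)} \\
&= \{\sigma : \{l\} \cdot \sigma \in M(e_1)\} \cdot M(e_2) \cup \{\sigma : \{l\} \cdot \sigma \in T(M(e_1)) \cdot M(e_2)\} && \text{(by Claim 1)} \\
&= \{\sigma : \{l\} \cdot \sigma \in M(e_1) \setminus T(M(e_1))\} \cdot M(e_2) \cup \{\sigma : \{l\} \cdot \sigma \in T(M(e_1)) \cdot M(e_2)\} && \text{(by Claim 2)} \\
&= \{\sigma : \{l\} \cdot \sigma \in (M(e_1) \setminus T(M(e_1))) \cdot M(e_2)\} \cup \{\sigma : \{l\} \cdot \sigma \in T(M(e_1)) \cdot M(e_2)\} && \text{(by Claim 3)} \\
&= \{\sigma : \{l\} \cdot \sigma \in M(e_1) \cdot M(e_2)\} \\
&= D_l(M(e_1) \cdot M(e_2)) && \text{(by definition of } D_l\text{)} \\
&= D_l(M(e_1 \cdot e_2)) && \text{(by definition of } M\text{)}.
\end{align*}

The other cases are similar. □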



6 Example 

In this section, we use the notions of pseudo-bisimulation and the coinduction proof principle
(Corollary 3.9), along with the derivative operator $\hat{D}$, to prove the equivalence of two mixed
languages specified as mixed expressions.

Fix $\mathcal{P}$ to be the set of primitive programs $\{p, q\}$, and $\mathcal{B}$ to be the set of primitive tests $\{b, c\}$. Let
$[b]$ be a shorthand for $(b + \bar{b})$. Define $\alpha$ to be the mixed expression

\[ (b p ([b] c q)^* \bar{c})^* \bar{b} \]

and $\beta$ to be the mixed expression

\[ b p ([b] c q + b \bar{c} p)^* \bar{c} \bar{b} + \bar{b}. \]

Our goal is to prove that $\alpha$ and $\beta$ are equivalent, in the sense that they induce the same language
via the mapping $M$. In other words, we want to establish that $M(\alpha) = M(\beta)$. This example
demonstrates the equivalence of the program

    while b do {
        p;
        while c do q
    }

and the program

    if b then {
        p;
        while b + c do
            if c then q else p
    }

This equivalence is a component of the proof of the classical result that every while program can be 
simulated by a while program with at most one while loop, as presented by Kozen [1997]. We refer 
the reader there for more details. 

There are a few ways to establish this equivalence. One is to rely on a sound and complete
axiomatization of the equational theory of KAT, and derive the equivalence of $\alpha$ and $\beta$ algebraically
[Kozen and Smith 1996]. Another approach is to first construct for each expression an automaton
that accepts the language it denotes, and then minimize both automata [Kozen 2003]. Two
expressions are then equal if the two resulting automata are isomorphic.

In this paper, we describe a third approach, using the coinductive proof principle for mixed
languages embodied by Corollary 3.9. Since the theory we developed in Section 3 applies only to
mixed languages of type $A \to \emptyset$, we verify that indeed we have $\vdash \alpha : \{b\} \to \emptyset$ and $\vdash \beta : \{b\} \to \emptyset$,
so that, by Proposition 5.1, $M(\alpha)$ and $M(\beta)$ are languages of type $\{b\} \to \emptyset$.

We prove the equivalence of $\alpha$ and $\beta$ by showing that the mixed languages $M(\alpha)$ and $M(\beta)$
are pseudo-bisimilar, that is, they are related by some pseudo-bisimulation. More specifically, we
exhibit a pseudo-bisimulation, relative to the ordering $b_1 = b$, $b_2 = c$, on the final automaton
$\mathcal{L}$, such that $M(\alpha)$ and $M(\beta)$ are pseudo-bisimilar. This is sufficient for proving equivalence,
since by Theorem 4.1, the languages $M(\alpha)$ and $M(\beta)$ are then bisimilar, and by Corollary 3.9,
$M(\alpha) = M(\beta)$.

Define $\alpha'$ to be the mixed expression

\[ ([b] c q)^* \bar{c} \alpha \]

and define $\beta'$ to be the mixed expression

\[ ([b] c q + b \bar{c} p)^* \bar{c} \bar{b}. \]

Notice that $\beta = b p \beta' + \bar{b}$.

We note that (using the notation of the definition of pseudo-bisimulation), $A_0 = \emptyset$, $A_1 = \{b\}$,
and $A_2 = \{b, c\}$. We claim that the following three relations form a pseudo-bisimulation:

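\begin{align*}
R_2 &= \{(M(\alpha'), M(\beta')),\ (M(0), M(0))\} \\
R_1 &= \{(M([b] q \alpha'), M([b] q \beta')),\ (M(\alpha), M(\beta))\} \\
R_0 &= \{(M(p \alpha'), M(p \beta')),\ (M(q \alpha'), M(q \beta')),\ (M(1), M(1)),\ (M(0), M(0))\}.
\end{align*}

It is straightforward to verify that $(R_0, R_1, R_2)$ is a pseudo-bisimulation on $\mathcal{L}$, using the
operators defined in the previous section. For instance, consider $D_b(M(\alpha))$, which is equal to $M(\hat{D}_b(\alpha))$
by Proposition 5.4. We compute $\hat{D}_b(\alpha)$ here:

\begin{align*}
\hat{D}_b(\alpha) &= \hat{D}_b((b p ([b] c q)^* \bar{c})^*) \, \bar{b} + \hat{T}((b p ([b] c q)^* \bar{c})^*) \, \hat{D}_b(\bar{b}) \\
&= \hat{D}_b(b p ([b] c q)^* \bar{c}) \, (b p ([b] c q)^* \bar{c})^* \, \bar{b} + \hat{T}((b p ([b] c q)^* \bar{c})^*) \, 0 \\
&= p ([b] c q)^* \bar{c} \, (b p ([b] c q)^* \bar{c})^* \, \bar{b} \\
&= p \alpha'.
\end{align*}

Hence, $D_b(M(\alpha)) = M(\hat{D}_b(\alpha)) = M(p \alpha')$. The other cases are similar.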

As we shall see shortly, there is a way to mechanically construct such a bisimulation to establish
the equivalence of two mixed expressions.

We remark that an alternative approach to establish equivalence of while programs based on 
coalgebras is described by Rutten [1999]. This approach uses the operational semantics of the 
programs instead of an algebraic framework. 

7 Completeness 

Thus far, we have established a coinductive proof technique for establishing the equality of mixed
languages (Section 3), and illustrated its use by showing the equality of two particular mixed
languages specified by mixed expressions (Section 6), making use of the derivative calculus developed
in Section 5. A natural question about this proof technique is whether or not it can establish the
equivalence of any two mixed expressions that are equivalent (in that they specify the same mixed
language). In this section, we answer this question in the affirmative by formalizing and proving
a completeness theorem for our proof technique. In particular, we show that given two equivalent
mixed expressions, a finite bisimulation relating them can be effectively constructed, by performing
only simple syntactic manipulations. In fact, we exhibit a deterministic procedure for deciding
whether or not two mixed expressions are equivalent.

In order to state our completeness theorem, we need a few definitions. We say that two mixed
expressions $e_1$ and $e_2$ are equal up to ACI properties, written $e_1 \stackrel{ACI}{=} e_2$, if $e_1$ and $e_2$ are syntactically
equal, up to the associativity, commutativity, and idempotence of $+$. That is, $e_1$ and $e_2$ are equal
up to ACI properties if the following three rewriting rules can be applied to subexpressions of $e_1$ to
obtain $e_2$:

\begin{align*}
e + (f + g) &= (e + f) + g \\
e + f &= f + e \\
e + e &= e.
\end{align*}

Given a relation $R$ between mixed expressions, we define an induced relation $R^{ACI}$ as follows:
$e_1 R^{ACI} e_2$ if and only if there exist $e_1', e_2'$ such that $e_1 \stackrel{ACI}{=} e_1'$, $e_2 \stackrel{ACI}{=} e_2'$, and $e_1' R e_2'$.
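
One way to test equality up to ACI properties mechanically is to compute a canonical form and compare syntactically. The sketch below is an illustration only: the tuple encoding of expressions and the names `summands`, `aci_normal`, and `aci_equal` are assumptions of this example, and the canonical ordering of summands (by `repr`) is an arbitrary choice.

```python
def summands(e):
    """Flatten nested sums into the list of their non-sum summands."""
    if e[0] == "sum":
        return summands(e[1]) + summands(e[2])
    return [e]


def aci_normal(e):
    """Canonical representative of e modulo ACI of +.

    Expressions are nested tuples (an encoding assumed for this sketch):
    ("zero",), ("one",), ("prog", name), ("lit", base, polarity),
    ("sum", e1, e2), ("seq", e1, e2), ("star", e).
    """
    tag = e[0]
    if tag == "sum":
        # Normalize the summands first, since the rewriting rules may be
        # applied to any subexpression.
        parts = summands(("sum", aci_normal(e[1]), aci_normal(e[2])))
        # Idempotence: drop duplicates. Commutativity: fix an order.
        unique = sorted(set(parts), key=repr)
        # Associativity: rebuild as a left-nested sum.
        result = unique[0]
        for part in unique[1:]:
            result = ("sum", result, part)
        return result
    if tag == "seq":
        return ("seq", aci_normal(e[1]), aci_normal(e[2]))
    if tag == "star":
        return ("star", aci_normal(e[1]))
    return e


def aci_equal(e1, e2):
    """True when e1 and e2 are equal up to ACI properties."""
    return aci_normal(e1) == aci_normal(e2)
```

With `p = ("prog", "p")` and `q = ("prog", "q")`, the expressions `p + (q + p)` and `(q + p) + q` normalize to the same sum, while `p · q` and `q · p` do not, since only `+` is treated modulo ACI.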

We define a syntactic bisimulation between two mixed expressions $e_1$ and $e_2$ having the same
type $B \to \emptyset$ (for some $B \subseteq \mathcal{B}$) to be a family $R = (R_A)_{A \in \wp(\mathcal{B})}$ of relations such that

(1) for all mixed expressions $e, e'$, if $e R_A e'$, then $\vdash e : A \to \emptyset$ and $\vdash e' : A \to \emptyset$,

(2) $e_1 R_B e_2$,

(3) for all mixed expressions $e, e'$, if $e R_\emptyset e'$, then $\hat{\epsilon}(e) = \hat{\epsilon}(e')$, and for all $p \in \mathcal{P}$, $\hat{D}_p(e) R^{ACI}_{\mathcal{B}} \hat{D}_p(e')$,
and

(4) for all mixed expressions $e, e'$, if $e R_A e'$ (for $A \neq \emptyset$), then for all $l \in \mathit{lit}(A)$, $\hat{D}_l(e) R^{ACI}_{A \setminus \{\mathit{base}(l)\}} \hat{D}_l(e')$.

A syntactic bisimulation resembles a bisimulation, but is defined over mixed expressions, rather
than over mixed languages. The next theorem shows that any two equivalent mixed expressions are
related by a finite syntactic bisimulation, that is, a syntactic bisimulation $R$ where the number of
pairs in each relation $R_A$ is finite.

Theorem 7.1: For all mixed expressions $e_1, e_2$ of type $A \to \emptyset$, $M(e_1) = M(e_2)$ if and only if
there exists a finite syntactic bisimulation between $e_1$ and $e_2$.

Proof: ($\Leftarrow$) It is easy to check that a syntactic bisimulation $R$ induces a bisimulation $R'$ such that
$e_1 R_A e_2$ if and only if $M(e_1) R'_A M(e_2)$. The result then follows by Corollary 3.9.

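($\Rightarrow$) We first show how to construct, for every mixed expression $e$ with $\vdash e : A_e \to B_e$, a
finite-state automaton $M = ((S_A)_{A \in \wp(\mathcal{B})}, (\delta_A)_{A \in \wp(\mathcal{B})})$ with transition functions $\delta_\emptyset : S_\emptyset \times \mathcal{P} \to S_{\mathcal{B}}$ and
(for $A \neq \emptyset$) $\delta_A : S_A \times \mathit{lit}(A) \to \bigcup_{A \in \wp(\mathcal{B})} S_A$ satisfying the conditions

(1) $\delta_A(s, l) \in S_{A \setminus \{\mathit{base}(l)\}}$,

(2) the states of $S_A$ are mixed expressions having type $A \to B_e$,

(3) $e$ is a state of $S_{A_e}$,

(4) if $\delta_\emptyset(s_1, p) = s_2$, then $\hat{D}_p(s_1) \stackrel{ACI}{=} s_2$, and

(5) if $\delta_A(s_1, l) = s_2$, then $\hat{D}_l(s_1) \stackrel{ACI}{=} s_2$.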

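We define the automaton by induction on the structure of $e$. The base cases ($0$, $1$, $p$, and $l$) are
straightforward. We focus on the remaining cases:

Case $e = e_1 + e_2$: Assume by induction that we have automata $M_1, M_2$ for $e_1$ and $e_2$. Define:

\begin{align*}
S_A &= \{f_1 + f_2 : f_1 \in S_{1,A},\ f_2 \in S_{2,A}\} \\
\delta_\emptyset(f_1 + f_2, p) &= \delta_{1,\emptyset}(f_1, p) + \delta_{2,\emptyset}(f_2, p) \\
\delta_A(f_1 + f_2, l) &= \delta_{1,A}(f_1, l) + \delta_{2,A}(f_2, l), \quad \text{for } A \neq \emptyset,\ l \in \mathit{lit}(A).
\end{align*}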

Case $e = e_1 \cdot e_2$: Let $\vdash e_1 : A_1 \to B_1$. Assume by induction that we have automata $M_1, M_2$ for
$e_1$ and $e_2$. Define:

\[ S_A = \Big\{ f \cdot e_2 + \sum_{(t,g) \in E} t \cdot g + \sum_{g \in G} g \ :\ f \in S_{1,A},\ E \subseteq \mathit{Tests}(A \to B_1) \times S_{2,B_1},\ G \subseteq S_{2,A} \Big\} \]

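\[ \delta_\emptyset\Big(f \cdot e_2 + \sum_{g \in G} g,\ p\Big) = \begin{cases} \delta_{1,\emptyset}(f, p) \cdot e_2 + \delta_{2,\emptyset}(e_2, p) + \sum_{g \in G} \delta_{2,\emptyset}(g, p) & \text{if } B_1 = \emptyset,\ \hat{\epsilon}(f) = 1 \\ \delta_{1,\emptyset}(f, p) \cdot e_2 + \sum_{g \in G} \delta_{2,\emptyset}(g, p) & \text{otherwise} \end{cases} \]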

SAif-e2+ E + E 9,1) = 

it,g)eE geG 

' Si^Aif, I) -62+ E Di{t) • 5 + E '^2,^(5, if base{l) eA\Bi 

{t,g)eE geG 

<5i,A(/,0-e2+ E t-52,BA9,l)+T. ^2A9,l) iihase{l)(^A\JBi 
(t,g)eE geG 

SiAf, • 62 + f{f) ■ 52,B, (62, /)+ if base{l) G B^ 

E * ■ {9, + E <^2,a(5, 

{t,g)eE geG 

for A 0,1 e lit{A). 



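Case $e = e_1^*$: Let $\vdash e_1 : A_1 \to A_1$. Assume by induction that we have an automaton $M_1$ for $e_1$.
Define:

\[ S_A = \begin{cases} \Big\{ \gamma \cdot e_1^* + \sum_{f \in F} f \cdot e_1^* \ :\ \gamma \in \{0, 1\},\ F \subseteq S_{1,A_1} \Big\} & \text{if } A = A_1 \\ \Big\{ \sum_{f \in F} f \cdot e_1^* \ :\ F \subseteq S_{1,A} \Big\} & \text{otherwise} \end{cases} \]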

50(7-6^+ E /•et,p) = 

7 • (^1,0(61, • + E K^if^P) • + E • <^i,0(e,p) • el, 
feF feF 

for A = Ai 

feF feF 
^A{l-e\-r E /•el,0 = 

f^F 

7-<5M(ei,0-et+ E <JM(/,0-eI+ E • ^^(e, ' e^, 

for A / 0,^ = Ai,^ G Zit(^) 

/ • et, = • et, for A + 0,A + Ay,l^ U{A). 

feF feF 

It is straightforward (if tedious) to verify that the resulting automaton satisfies properties (1)-(5)
given above.

This completes the construction of the finite state mixed automaton corresponding to $e$.

Given equivalent mixed expressions $e_1$ and $e_2$ of type $A \to \emptyset$, a finite syntactic bisimulation
$R$ can be constructed as follows. First, construct the automata $M_1$ and $M_2$ corresponding to $e_1$ and
$e_2$. Then, initialize $R$ to contain the pair $(e_1, e_2)$, and iterate the following process: for every $(e, e')$
in $R$, add the pairs $(\delta_{1,B}(e, x), \delta_{2,B}(e', x))$ (where $e, e'$ have type $B \to \emptyset$), for all $x$. Perform this
iteration until no new pairs are added to $R$. This must terminate, because there are finitely many
pairs of states $(e, e')$ with $e$ in $M_1$ and $e'$ in $M_2$. It is straightforward to check that $R$ is a syntactic
bisimulation, under the assumption that $M(e_1) = M(e_2)$. □

The procedure described in the proof of Theorem 7.1 can in fact be easily turned into a procedure
for deciding if two mixed expressions are equivalent. To perform this decision, construct $R$, and
verify that at all pairs of states $(e, e')$ in $R$, $\hat{\epsilon}(e) = \hat{\epsilon}(e')$. If this verification fails, then the two mixed
expressions are not equivalent; otherwise, they are equivalent.
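
The closure loop behind this decision procedure can be sketched generically in Python. The sketch abstracts the two derivative automata as plain transition functions and, as a simplification, checks the acceptance condition while the relation is being built rather than afterwards. All names (`find_bisimulation`, `inputs`, `step1`, `accepts1`, and so on) are assumptions of this example; in the setting of the paper, `inputs` would return the primitive programs for a state of type $\emptyset \to \emptyset$ and the literals of $A$ for a state of type $A \to \emptyset$, and `accepts1`, `accepts2` would compare the value of $\hat{\epsilon}$ with $1$. The toy instance at the end uses two small ordinary automata purely to exercise the loop.

```python
from collections import deque


def find_bisimulation(start1, start2, inputs, step1, step2, accepts1, accepts2):
    """Return the set of reachable state pairs, or None if some pair
    disagrees on acceptance (so the start states are not equivalent)."""
    relation = {(start1, start2)}
    queue = deque([(start1, start2)])
    while queue:
        s1, s2 = queue.popleft()
        if accepts1(s1) != accepts2(s2):
            return None
        # Paired states have the same type, so the inputs applicable to
        # s1 are also applicable to s2.
        for x in inputs(s1):
            pair = (step1(s1, x), step2(s2, x))
            if pair not in relation:
                relation.add(pair)
                queue.append(pair)
    return relation


# Toy instance: both automata accept exactly the even-length runs of "p".
delta1 = {(0, "p"): 1, (1, "p"): 0}
delta2 = {("a", "p"): "b", ("b", "p"): "c", ("c", "p"): "b"}

result = find_bisimulation(
    0, "a",
    lambda s: ["p"],
    lambda s, x: delta1[s, x],
    lambda s, x: delta2[s, x],
    lambda s: s == 0,
    lambda s: s in {"a", "c"},
)
```

The loop terminates whenever both automata have finitely many states, which mirrors the termination argument in the proof of Theorem 7.1.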

The bisimulation in Section 6 is indeed a bisimulation induced by a syntactic bisimulation on
the mixed expressions $\alpha$ and $\beta$.

8 Conclusions and Future Work 

We believe that proofs of equivalence between mixed expressions such as $\alpha$ and $\beta$ via bisimulation
are in general more easily derived than ones obtained through a sound and complete axiomatization
of KAT. Given two equivalent mixed expressions, we can exhibit a bisimulation using the purely
mechanical procedure underlying Theorem 7.1: use the derivative operators to construct a finite
bisimulation in which the two expressions are paired. In contrast, equational reasoning typically
requires creativity.

The "path independence" of a mixed automaton (condition A2) gives any mixed automaton 
a certain form of redundancy. This redundancy persists in the definition of bisimulation, and is 
the reason why a pseudo-bisimulation, a seemingly weaker notion of bisimulation, gives rise to 
a bisimulation. An open question is to cleanly eliminate this redundancy; a particular motivation 
for doing this would be to make proofs of expression equivalence as simple as possible. Along 
these lines, it would be of interest to develop other weaker notions of bisimulation that give rise to 
bisimulations; pseudo-bisimulations require a sort of "fixed variable ordering" that does not seem 
absolutely necessary. 

Another issue for future work would be to give a class of expressions wider than our mixed
expressions for which there are readily understandable and applicable rules for computing derivatives.
In particular, a methodology for computing derivatives of the KAT expressions defined by Kozen
[1997] would be nice to see. Intuitively, there seems to be a tradeoff between the expressiveness
of the regular expression language and the simplicity of computing derivatives (in the context of
KAT). Formal work towards understanding this tradeoff could potentially be quite useful.